\section{Generating Tokens}
\label{sec:tok}
To transform a page's content into a format that is better suited for further processing, the class \textit{Tokenizer} has been created.

The \textit{Tokenizer} converts a string to a list of tokens. To make the tokens, the input string is first converted into an array of words via the \texttt{string.Split} method, which in this case splits the input string on white space. Next, the stopwords are removed from this list of tokens. Stopwords are common words that act as filler in a sentence. Removing them reduces the storage space needed for the tokens. The stopwords are taken from the www.ranks.nl list of stopwords.
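
The two steps described above can be illustrated with a minimal sketch. It is written in Python rather than in the language of the actual \textit{Tokenizer} class, and it is not the project's implementation. The function name \texttt{tokenize}, the constant \texttt{STOPWORDS}, and the choice to ignore case when matching stopwords are assumptions made for illustration. The stopword set is a small sample, not the full www.ranks.nl list.

\begin{verbatim}
# Small illustrative sample only; the real list comes from www.ranks.nl.
STOPWORDS = {"a", "an", "and", "is", "of", "the", "to"}


def tokenize(text, stopwords=STOPWORDS):
    """Split text on whitespace and drop stopwords."""
    # Called without arguments, split() breaks on any run of whitespace.
    words = text.split()
    # Assumption: stopword matching ignores case.
    return [w for w in words if w.lower() not in stopwords]


tokens = tokenize("The crawler stores the content of a page")
print(tokens)  # ['crawler', 'stores', 'content', 'page']
\end{verbatim}

Keeping the stopwords in a set makes each membership check a constant-time lookup on average, which matters when many pages are tokenized.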